ROSO: Improving Robotic Policy Inference via Synthetic Observations
Miyashita, Yusuke, Gahtidis, Dimitris, La, Colin, Rabinowicz, Jeremy, Leitner, Jurgen
In this paper, we propose the use of generative artificial intelligence (AI) to improve the zero-shot performance of a pre-trained policy by altering observations during inference. Modern robotic systems, powered by advanced neural networks, have demonstrated remarkable capabilities on pre-trained tasks. However, generalizing and adapting to new objects and environments is challenging, and fine-tuning visuomotor policies is time-consuming. To overcome these issues, we propose Robotic Policy Inference via Synthetic Observations (ROSO). ROSO uses Stable Diffusion to pre-process a robot's observations of novel objects at inference time so that they fall within the observation distribution of the pre-trained policy. This novel paradigm allows us to transfer learned knowledge from known tasks to previously unseen scenarios, enhancing the robot's adaptability without requiring lengthy fine-tuning. Our experiments show that incorporating generative AI into robotic inference significantly improves success rates, completing up to 57% of tasks that would otherwise fail under the pre-trained policy.
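The core idea of editing observations back into the policy's training distribution can be illustrated with a minimal sketch. All names here are hypothetical stand-ins, not the authors' code: a real system would use a diffusion model to inpaint pixels, whereas here the "observation" is a symbolic list of object labels.

```python
def edit_observation(observation, novel_to_known):
    """Replace novel object labels with known ones the policy was trained on.
    Stand-in for diffusion-based image editing of the raw observation."""
    return [novel_to_known.get(obj, obj) for obj in observation]

def pretrained_policy(observation, known_objects):
    """Toy policy: succeeds only if every object in view was seen in training."""
    return all(obj in known_objects for obj in observation)

known = {"red block", "blue bowl"}          # objects seen during training
mapping = {"green mug": "blue bowl"}        # map the unseen mug to a seen bowl

raw_obs = ["red block", "green mug"]
assert pretrained_policy(raw_obs, known) is False       # fails on the novel object
synthetic_obs = edit_observation(raw_obs, mapping)
assert pretrained_policy(synthetic_obs, known) is True  # succeeds after editing
```

The point of the sketch is the placement of the edit: it happens purely at inference time, in front of a frozen policy, rather than by fine-tuning the policy itself.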
Compositional Foundation Models for Hierarchical Planning
Ajay, Anurag, Han, Seungwook, Du, Yilun, Li, Shuang, Gupta, Abhi, Jaakkola, Tommi, Tenenbaum, Josh, Kaelbling, Leslie, Srivastava, Akash, Agrawal, Pulkit
To make effective decisions in novel environments with long-horizon goals, it is crucial to engage in hierarchical reasoning across spatial and temporal scales. This entails planning abstract subgoal sequences, visually reasoning about the underlying plans, and executing actions in accordance with the devised plan through visual-motor control. We propose Compositional Foundation Models for Hierarchical Planning (HiP), a foundation model that leverages multiple expert foundation models, each trained individually on language, vision, and action data, and composes them to solve long-horizon tasks. We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model. Generated video plans are then grounded in visual-motor control through an inverse dynamics model that infers actions from the generated videos. To enable effective reasoning within this hierarchy, we enforce consistency between the models via iterative refinement. We illustrate the efficacy and adaptability of our approach on three different long-horizon table-top manipulation tasks.
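The three-level hierarchy described above can be sketched with stub models; every function below is an illustrative stand-in (symbolic strings in place of real text, frames, and actions), not the HiP implementation.

```python
def language_planner(goal):
    # stand-in for a large language model producing symbolic subgoals
    return [f"{goal}:step{i}" for i in range(1, 4)]

def video_model(subgoal, n_frames=3):
    # stand-in for a video diffusion model rendering a plan for one subgoal
    return [f"{subgoal}/frame{t}" for t in range(n_frames)]

def inverse_dynamics(frame_a, frame_b):
    # stand-in for the model inferring the action between consecutive frames
    return f"action({frame_a}->{frame_b})"

def hip_plan(goal):
    """Compose the three experts: subgoals -> video plans -> actions."""
    actions = []
    for subgoal in language_planner(goal):
        frames = video_model(subgoal)
        actions += [inverse_dynamics(a, b) for a, b in zip(frames, frames[1:])]
    return actions

plan = hip_plan("stack-blocks")
assert len(plan) == 6  # 3 subgoals x 2 frame transitions each
```

What the sketch cannot show is the iterative-refinement step the abstract mentions, in which feedback between the levels enforces consistency; here each level simply consumes the previous level's output once.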
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
Wang, Renhao, Mao, Jiayuan, Hsu, Joy, Zhao, Hang, Wu, Jiajun, Gao, Yang
Robots operating in the real world require both rich manipulation skills and the ability to reason semantically about when to apply those skills. Towards this goal, recent works have integrated semantic representations from large-scale pretrained vision-language (VL) models into manipulation models, imparting them with more general reasoning capabilities. However, we show that the conventional pretraining-finetuning pipeline for integrating such representations entangles the learning of domain-specific action information and domain-general visual information, leading to less data-efficient training and poor generalization to unseen objects and tasks. To this end, we propose ProgramPort, a modular approach that better leverages pretrained VL models by exploiting the syntactic and semantic structures of language instructions. Our framework uses a semantic parser to recover an executable program composed of functional modules grounded in vision and action across different modalities. Each functional module is realized as a combination of deterministic computation and learnable neural networks. Program execution produces parameters for general manipulation primitives for a robotic end-effector. The entire modular network can be trained with end-to-end imitation learning objectives. Experiments show that our model successfully disentangles action and perception, translating to improved zero-shot and compositional generalization across a variety of manipulation behaviors. Project webpage at: \url{https://progport.github.io}.
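The parse-then-execute structure can be sketched as follows. This is a hypothetical toy, not the ProgramPort system: the parser is a hard-coded template, the scene is a symbolic list, and the "modules" are plain functions rather than grounded neural networks.

```python
def semantic_parse(instruction):
    # toy parser: "pick the red block" ->
    # [("filter_color", "red"), ("filter_shape", "block"), ("pick", None)]
    words = instruction.split()
    return [("filter_color", words[2]), ("filter_shape", words[3]), ("pick", None)]

SCENE = [{"color": "red", "shape": "block", "pos": (0.1, 0.2)},
         {"color": "blue", "shape": "bowl", "pos": (0.4, 0.5)}]

def execute(program, scene):
    """Run the recovered program: perception modules filter the scene,
    then the action module emits parameters for a manipulation primitive."""
    objs = scene
    for op, arg in program:
        if op == "filter_color":
            objs = [o for o in objs if o["color"] == arg]
        elif op == "filter_shape":
            objs = [o for o in objs if o["shape"] == arg]
        elif op == "pick":
            return {"primitive": "pick", "target": objs[0]["pos"]}
    return None

params = execute(semantic_parse("pick the red block"), SCENE)
assert params == {"primitive": "pick", "target": (0.1, 0.2)}
```

The sketch shows the disentanglement the abstract claims: the filtering modules touch only perception, while the final module touches only action parameters, so each could in principle be swapped or retrained independently.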
Flawed AI makes robots racist, sexist
The work, led by Johns Hopkins University, Georgia Institute of Technology, and University of Washington researchers, is believed to be the first to show that robots loaded with an accepted and widely-used model operate with significant gender and racial biases. The work is set to be presented and published this week at the 2022 Conference on Fairness, Accountability, and Transparency. "The robot has learned toxic stereotypes through these flawed neural network models," said author Andrew Hundt, a postdoctoral fellow at Georgia Tech who co-conducted the work as a PhD student working in Johns Hopkins' Computational Interaction and Robotics Laboratory. "We're at risk of creating a generation of racist and sexist robots, but people and organizations have decided it's OK to create these products without addressing the issues." Those building artificial intelligence models to recognize humans and objects often turn to vast datasets available for free on the Internet.
AI Is Learning Human Biases: Robot's Racist And Sexist Behaviour Shocks Researchers
'Everything a creator builds is in their own image' - a sentiment we've been fed since forever might actually be true. A robot recently shocked scientists after it became racist and sexist. While such deplorable behaviour is commonly observed among humans, we had better hopes for artificial intelligence. If you expected AI to be impartial and intellectually superior, that's clearly not the case. A recent experiment by researchers from Johns Hopkins University, Georgia Institute of Technology, and the University of Washington showed how a robot controlled by a machine learning tool began to categorise people based on dangerous stereotypes about race and gender.
Fears AI may create sexist bigots as test learns 'toxic stereotypes'
Fears have been raised about the future of artificial intelligence after a robot was found to have learned 'toxic stereotypes' from the internet. The machine showed significant gender and racial biases, including gravitating toward men over women and white people over people of colour during tests by scientists. It also jumped to conclusions about people's jobs after a glance at their faces. 'The robot has learned toxic stereotypes through these flawed neural network models,' said author Andrew Hundt, a postdoctoral fellow at Georgia Tech who co-conducted the work as a PhD student working in Johns Hopkins' Computational Interaction and Robotics Laboratory in Baltimore, Maryland. 'We're at risk of creating a generation of racist and sexist robots but people and organisations have decided it's OK to create these products without addressing the issues.'